Distribution-free and model-free multivariate feature screening via multivariate rank distance correlation
Authors
Abstract
Feature screening approaches are effective in selecting active features from data with ultrahigh dimensionality and increasing complexity; however, the majority of existing feature screening approaches are either restricted to a univariate response or rely on some distribution or model assumptions. In this article, we propose a novel sure independence screening approach based on the multivariate rank distance correlation (MrDc-SIS). The MrDc-SIS achieves multiple desirable properties, such as being distribution-free, completely nonparametric, scale-free, robust to outliers and heavy tails, and sensitive to hidden structures. Moreover, the MrDc-SIS can be used to screen either univariate or multivariate responses and either one-dimensional or multi-dimensional predictors. We establish the asymptotic consistency property of the MrDc-SIS under a mild condition by lifting previous assumptions about finite moments. Simulation studies demonstrate that the MrDc-SIS outperforms three other closely relevant approaches under various settings. We also apply the MrDc-SIS approach to multi-omics ovarian carcinoma data downloaded from The Cancer Genome Atlas (TCGA).
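The screening idea in the abstract can be illustrated with a minimal sketch: rank-transform the data, score each predictor by its sample distance correlation with the (possibly multivariate) response, and keep the top-scoring predictors. This is not the authors' MrDc-SIS implementation. The abstract does not specify how the multivariate ranks are built, so the sketch assumes simple componentwise ranks as a stand-in, and the function names (`componentwise_ranks`, `distance_correlation`, `rank_dc_screen`) and the `keep` parameter are hypothetical.

```python
import numpy as np


def componentwise_ranks(a):
    """Column-wise ranks scaled to (0, 1].

    ASSUMPTION: a simplified stand-in for the paper's multivariate ranks.
    """
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    n = a.shape[0]
    # argsort of argsort yields 0-based ranks within each column
    ranks = np.argsort(np.argsort(a, axis=0), axis=0) + 1
    return ranks / n


def centered_distances(a):
    """Pairwise Euclidean distance matrix with row, column and grand means removed."""
    d = np.sqrt(((a[:, None, :] - a[None, :, :]) ** 2).sum(axis=2))
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation (V-statistic form) between two (n, d) arrays."""
    A = centered_distances(x)
    B = centered_distances(y)
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom == 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def rank_dc_screen(X, Y, keep):
    """Score every column of X against Y on ranks; return the top `keep` indices and all scores."""
    X = np.asarray(X, dtype=float)
    ranked_y = componentwise_ranks(Y)
    scores = np.array([
        distance_correlation(componentwise_ranks(X[:, j]), ranked_y)
        for j in range(X.shape[1])
    ])
    order = np.argsort(-scores)  # indices sorted by decreasing score
    return order[:keep], scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 50))
    # Bivariate response: one nonlinear signal and one linear signal with heavy-tailed noise
    Y = np.column_stack([
        X[:, 0] ** 2 + 0.1 * rng.standard_normal(200),
        X[:, 1] + 0.1 * rng.standard_t(df=2, size=200),
    ])
    selected, scores = rank_dc_screen(X, Y, keep=5)
    print("selected feature indices:", selected)
```

Because the scores depend only on ranks, they are unchanged by any strictly increasing transformation of a single variable, which is one way to read the "scale-free" and outlier-robust properties claimed above. The pairwise distance matrices cost O(n^2) memory per predictor, so this sketch suits moderate sample sizes only.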
Similar resources
A Distribution-Free Multivariate Control Chart
Monitoring multivariate quality variables or data streams remains an important and challenging problem in statistical process control (SPC). Although multivariate SPC has been extensively studied in the literature, designing distribution-free control schemes is still challenging and has yet to be addressed well. This paper develops a new nonparametric methodology for monitoring location parame...
Multivariate normal distribution - Wikipedia, the free encyclopedia
1 General case 1.1 Cumulative distribution function 1.2 A counterexample 1.3 Normally distributed and independent 2 Bivariate case 3 Affine transformation 4 Geometric interpretation 5 Correlations and independence 6 Higher moments 7 Conditional distributions 8 Fisher information matrix 9 Kullback-Leibler divergence 10 Estimation of parameters 11 Entropy 12 Multivariate normality tests 13 Drawin...
Feature Screening via Distance Correlation Learning.
This paper is concerned with screening features in ultrahigh dimensional data analysis, which has become increasingly important in diverse scientific fields. We develop a sure independence screening procedure based on the distance correlation (DC-SIS, for short). The DC-SIS can be implemented as easily as the sure independence screening procedure based on the Pearson correlation (SIS, for short...
Distribution Free Decomposition of Multivariate Data SPR'98 Invited submission
We present a practical approach to nonparametric cluster analysis of large data sets. The number of clusters and the cluster centers are automatically derived by mode seeking with the mean shift procedure on a reduced set of points randomly selected from the data. The cluster boundaries are delineated using a k-nearest neighbor technique. The proposed algorithm is stable and efficient, a 10000 po...
Journal
Journal title: Journal of Multivariate Analysis
Year: 2022
ISSN: 0047-259X, 1095-7243
DOI: https://doi.org/10.1016/j.jmva.2022.105081